IEICE global.ieice.org Site

Keyword Search Result

[Keyword] interconnection network(96hit)

41-60hit(96hit)

The Spanning Connectivity of the Burnt Pancake Graphs
Cherng CHIN Tien-Hsiung WENG Lih-Hsing HSU Shang-Chia CHIOU

PAPER-Algorithm Theory

Vol:
E92-D No:3
Page(s):
389-400
Let u and v be any two distinct vertices of an undirected graph G, which is k-connected. For 1 w k, a w-container C(u, v) of a k-connected graph G is a set of w-disjoint paths joining u and v. A w-container C(u, v) of G is a w *-container if it contains all the vertices of G. A graph G is w *-connected if there exists a w *-container between any two distinct vertices. Let κ(G) be the connectivity of G. A graph G is super spanning connected if G is i *-connected for 1 i κ(G). In this paper, we prove that the n-dimensional burnt pancake graph Bn is super spanning connected if and only if n ≠ 2.
Evaluation of Interconnect-Complexity-Aware Low-Power VLSI Design Using Multiple Supply and Threshold Voltages
Hasitha Muthumala WAIDYASOORIYA Masanori HARIYAMA Michitaka KAMEYAMA

PAPER-High-Level Synthesis and System-Level Design

Vol:
E91-A No:12
Page(s):
3596-3606
This paper presents a high-level synthesis approach to minimize the total power consumption in behavioral synthesis under time and area constraints. The proposed method has two stages, functional unit (FU) energy optimization and interconnect energy optimization. In the first stage, active and inactive energies of the FUs are optimized using a multiple supply and threshold voltage scheme. Genetic algorithm (GA) based simultaneous assignment of supply and threshold voltages and module selection is proposed. The proposed GA based searching method can be used in large size problems to find a near-optimal solution in a reasonable time. In the second stage, interconnects are simplified by increasing their sharing. This is done by exploiting similar data transfer patterns among FUs. The proposed method is evaluated for several benchmarks under 90 nm CMOS technology. The experimental results show that more than 40% of energy savings can be achieved by our proposed method.
Hamiltonian Cycles and Hamiltonian Paths in Faulty Burnt Pancake Graphs
Keiichi KANEKO

PAPER-Algorithm Theory

Vol:
E90-D No:4
Page(s):
716-721
Recently, research on parallel processing systems is very active, and many complex topologies have been proposed. A burnt pancake graph is one such topology. In this paper, we prove that a faulty burnt pancake graph with degree n has a fault-free Hamiltonian cycle if the number of the faulty elements is n-2 or less, and it has a fault-free Hamiltonian path between any pair of nonfaulty nodes if the number of the faulty elements is n-3 or less.
Rearrangeability of Tandem Cascade of Banyan-Type Networks
Xuesong TAN Shuo-Yen Robert LI

PAPER-Rearrangeable Network

Vol:
E90-D No:1
Page(s):
67-74
The cascade of two baseline networks in tandem is a rearrangeable network. The cascade of two omega networks appended with a certain interconnection pattern is also rearrangeable. These belong to the general problem: for what banyan-type network (i.e., bit-permuting unique-routing network) is the tandem cascade a rearrangeable network? We relate the problem to the trace and guide of banyan-type networks. Let τ denote the trace permutation of a 2n2n banyan-type network and γ the guide permutation of it. This paper proves that rearrangeability of the tandem cascade of the network is solely determined by the transposition τγ-1. Such a permutation is said to be tandem rearrangeable when the tandem cascade is indeed rearrangeable. We identify a few tandem rearrangeable permutations, each implying the rearrangeability of the tandem cascade of a wide class of banyan-type networks.
Node-Disjoint Paths Algorithm in a Transposition Graph
Yasuto SUZUKI Keiichi KANEKO Mario NAKAMORI

PAPER-Algorithm Theory

Vol:
E89-D No:10
Page(s):
2600-2605
In this paper, we give an algorithm for the node-to-set disjoint paths problem in a transposition graph. The algorithm is of polynomial order of n for an n-transposition graph. It is based on recursion and divided into two cases according to the distribution of destination nodes. The maximum length of each path and the time complexity of the algorithm are estimated theoretically to be O(n7) and 3n - 5, respectively, and the average performance is evaluated based on computer experiments.
Linear Layout of the Supercube
Jywe-Fei FANG

PAPER-Network

Vol:
E89-D No:2
Page(s):
779-782
In this paper, we study the layout problem of the supercube by embedding-in-book technique. The supercube, a generalization of the hypercube, can be constructed for an arbitrary system size. Moreover, it has the same diameter and connectivity of the corresponding hypercube. Embedding a graph in book is to place nodes along the spine of the book and to draw the edges such that edges residing in a page do not cross. An n-dimensional hypercube is regular because the degree of each node is n. The corresponding supercube with N nodes, where 2n-1 < N 2n and N ≠ 32n-2, is irregular because the degree of each node ranges from 2n - 2 to n - 1. Although Chung et al. have shown that an n-dimensional hypercube can be embedded with n - 1 pages, to lay out the supercube remains a challenging problem owing to its higher degree and irregular structure. In this paper, we show that a supercube of N nodes, where 2n-1 < N 2n, can also be embedded with n - 1 pages and book width N - 2n-2 for 2n-1 < N < 32n-2, and 2n-1 for 32n-2 N 2n.
A Note on the Implementation of de Bruijn Networks by the Optical Transpose Interconnection System
Kohsuke OGATA Toshinori YAMADA Shuichi UENO

LETTER-Graphs and Networks

Vol:
E88-A No:12
Page(s):
3661-3662
This note shows an efficient implementation of de Bruijn networks by the Optical Transpose Interconnection System (OTIS) extending previous results by Coudert, Ferreira, and Perennes [2].
Torus Ring: Improving Interconnection Network Performance by Modifying Hierarchical Ring
Jong Wook KWAK Hyong Jin BAN Chu Shik JHON

LETTER-Computer Systems

Vol:
E88-D No:5
Page(s):
1067-1071
In this letter, we propose "Torus Ring", which is a modified version of 2-level hierarchical ring. The Torus Ring has the same complexity as the hierarchical rings, since the only difference is the way it connects the local rings. It has an advantage over the hierarchical ring when the destination of a packet is the adjacent local ring, especially to the backward direction. Although we assume that the destination of a network packet is uniformly distributed across the processing nodes, the average number of hops in Torus Ring is equal to that of the hierarchical ring. However, the performance gain of the Torus Ring is expected to increase, due to the spatial locality of the application programs in the real parallel programming environment. In the simulation results, latencies of the interconnection network are reduced by up to 19%, with moderate ring utilization ratios.
MMLRU Selection Function: A Simple and Efficient Output Selection Function in Adaptive Routing
Michihiro KOIBUCHI Akiya JOURAKU Hideharu AMANO

PAPER-Computer Systems

Vol:
E88-D No:1
Page(s):
109-118
Adaptive routing algorithms, which dynamically select the route of a packet, have been widely studied for interconnection networks in massively parallel computers. An output selection function (OSF), which decides the output channel when some legal channels are free, is essential for an adaptive routing. In this paper, we propose a simple and efficient OSF called minimal multiplexed and least-recently-used (MMLRU). The MMLRU selection function has the following simple strategies for distributing the traffic: 1) each router locally grasps the congestion information by the utilization ratio of its own physical channels; 2) it is divided into the two selection steps, the choice from available physical channels and the choice from available virtual channels. The MMLRU selection function can be used on any type of network topology and adaptive routing algorithm. Simulation results show that the MMLRU selection function improves throughput and latency especially when the number of dimension becomes larger or the number of nodes per dimension become larger.
Utilization of the On-Chip L2 Cache Area in CC-NUMA Multiprocessors for Applications with a Small Working Set
Sung Woo CHUNG Hyong-Shik KIM Chu Shik JHON

PAPER-Networking and System Architectures

Vol:
E87-D No:7
Page(s):
1617-1624
In CC-NUMA multiprocessor systems, it is important to reduce the remote memory access time. Based upon the fact that increasing the size of the LRU second-level (L2) cache more than a certain value does not reduce the cache miss rate significantly, in this paper, we propose two split L2 caches to utilize the surplus of the L2 cache. The split L2 caches are composed of a traditional LRU cache and another cache to reduce the remote memory access time. Both work together to reduce total L2 cache miss time by keeping remote (or long-distance) blocks as well as recently used blocks. For another cache, we propose two alternatives: an L2-RVC (Level 2 - Remote Victim Cache) and an L2-DAVC (Level 2 - Distance-Aware Victim Cache). The proposed split L2 caches reduce total execution time by up to 27%. It is also found that the proposed split L2 caches outperform the traditional single LRU cache of double size.
Speculative Selection Routing in 2D Torus Network
Tran CONG SO Shigeru OYANAGI Katsuhiro YAMAZAKI

PAPER-Networking and System Architectures

Vol:
E87-D No:7
Page(s):
1666-1673
We have proposed a speculative selection function for adaptive routing, which uses idle cycles of the network physical links to exchange network information between nodes, thus helps to decide the best selection. Previous study on the mesh network showed that SSR gives message selection flexibility that improves network performance by balancing the network traffic in both global and local scopes. This paper evaluates the speculative selection function on 2D torus network with simulation. The simulation compares the network throughput and latency with various traffic patterns. The visualization graphs show how the speculative selection eliminates hotspots and disperses traffic in the global scope. The simulation results demonstrate that by using speculative selection, the network performance is increased by around 7%. Compared to the mesh network, the torus's version has smaller gain due to the high performance nature of the torus network.
Hierarchical Interconnection Networks Based on (3, 3)-Graphs for Massively Parallel Processors
Gene Eu JAN Yuan-Shin HWANG

PAPER-Networking and System Architectures

Vol:
E87-D No:7
Page(s):
1649-1656
This paper proposes several novel hierarchical interconnection networks based on the (3, 3)-graphs, namely folded (3, 3)-networks, root-folded (3, 3)-networks, recursively expanded (3, 3)-networks, and flooded (3, 3)-networks. Just as the hypercubes, CCC, Peterson-based networks, and Heawood-based networks, these hierarchical networks have the following nice properties: regular topology, high scalability, and small diameters. Due to these important properties, these hierarchical networks seem to have the potential as alternatives for the future interconnection structures of multicomputer systems, especially massively parallel processors (MPPs). Furthermore, this paper will present the routing and broadcasting algorithms for these proposed networks to demonstrate that these algorithms are as elegant as the algorithms for hypercubes, CCC, and Petersen- or Heawood-based networks.
A Dynamically Adaptive Hardware on Dynamically Reconfigurable Processor
Hideharu AMANO Akiya JOURAKU Kenichiro ANJO

INVITED PAPER

Vol:
E86-B No:12
Page(s):
3385-3391
A framework of dynamically adaptive hardware mechanism on multicontext reconfigurable devices is proposed, and as an example, an adaptive switching fabric is implemented on NEC's novel reconfigurable device DRP (Dynamically Reconfigurable Processor). In this switch, contexts for the full crossbar and alternative hadware modules, which provide larger bandwidth but can treat only a limited pattern of packet inputs, are prepared. Using the quick context switching functionality, a context for the full crossbar is replaced by alternative contexts according to the packet inputs pattern. If the context corresponding to requested alternative hadware modules is not inside the chip, it is loaded from outside chip to currently unused context memory, then replaced with the full size crossbar. If the traffic includes a lot of packets for specific destinations, a set of contexts frequently used in the traffic is gathered inside the chip like a working set stored in a cache. 4 4 mesh network connected with the proposed adaptive switches is simulated, and it appears that the latency between nodes is improved three times when the traffic between neighboring four nodes is dominant.
Minimum Feedback Node Sets in Trivalent Cayley Graphs
Yasuto SUZUKI Keiichi KANEKO

LETTER

Vol:
E86-D No:9
Page(s):
1634-1636
A minimum feedback node set in a graph is a minimum node subset whose deletion makes the graph acyclic. Its detection in a dependency graph is important to recover from a deadlock configuration. A livelock configuration is also avoidable if a check point is set in each node in the minimum feedback node set. Hence, its detection is very important to establish dependable network systems. In this letter, we give a minimum feedback node set in a trivalent Cayley graph. Assuming that each word has n bits, for any node, we can judge if it is included in the set or not in constant time.
Node-to-Set Disjoint Paths Problem in Pancake Graphs
Keiichi KANEKO Yasuto SUZUKI

PAPER-Algorithms and Applications

Vol:
E86-D No:9
Page(s):
1628-1633
In this paper, we give an algorithm for the node-to-set disjoint paths problem in pancake graphs with its evaluation results. The algorithm is of polynomial order of n for an n-pancake graph. It is based on recursion and divided into two cases according to the distribution of destination nodes in classes into which all the nodes in a pancake graph are categorized. The sum of lengths of paths obtained and the time complexity of the algorithm are estimated and the average performance is evaluated based on computer simulation.
An Algorithm for Node-Disjoint Paths in Pancake Graphs
Yasuto SUZUKI Keiichi KANEKO

PAPER-Algorithms

Vol:
E86-D No:3
Page(s):
610-615
For any pair of distinct nodes in an n-pancake graph, we give an algorithm for construction of n-1 internally disjoint paths connecting the nodes in the time complexity of polynomial order of n. The length of each path obtained and the time complexity of the algorithm are estimated theoretically and verified by computer simulation.
Escape and Restoration Routing: Suspensive Deadlock Recovery in Interconnection Networks
Toshinori TAKABATAKE Masato KITAKAMI Hideo ITO

PAPER-Computer Systems

Vol:
E85-D No:5
Page(s):
824-832
In interconnection networks, deadlock recovery has been studied in routing strategy. The routing strategy for the deadlock recovery is intended to optimize the routing performance when deadlocks do not occur. On the other hand, it is important to improve the routing performance by handling deadlocks if they occur. In this paper, a routing strategy for suspensive deadlock recovery called an escape-restoration routing is proposed and its performance is evaluated. In the principle of the proposed techniques, a small amount of exclusive buffer (escape-buffer) at each router is prepared for handling one of deadlocked packets. The transmission of the packet is suspended by temporarily escaping it to the escape-buffer. After the other deadlocked packets were sent, the suspended transmission resumes by restoring the escaped packet. Evaluation results show that the proposed techniques can improve the routing performance more than that of the previous recovery-based techniques in handling deadlocks.
A Generalized Processor Allocation Scheme for Recursively Decomposable Interconnection Networks
Fan WU Ching-Chi HSU

PAPER-Computer Systems

Vol:
E85-D No:4
Page(s):
694-713
The Recursively Decomposable Interconnection Network (RDIN) is a set of interconnection networks that can be recursively decomposed into smaller substructures whose topologies and properties are similar to the original one. The examples of the RDIN are hypercubes, star graph, mesh, tree, pyramid, pancake, and WK-recursive network. This paper proposed a uniform and simple model to represent the RDIN inside computers at first. Based on the model, a generalized and efficient allocation scheme capable of being applied to all the members of the RDIN is developed. The proposed scheme can fully recognize the substructures (such as subcube, substar, subtree,. . . ) more easily than ever, and it is the first one that can fully recognize all the incomplete substructures. The best-fit allocation is also proposed. The criterion aims at keeping the largest free parts from being destroyed, as is the philosophy of the best-fit allocation. Moreover, the proposed scheme can be performed in an injured RDIN with its processors and/or links faulty. Finally, the mathematical analysis and simulations for two instances, hypercubes and star graphs, of the RDIN are presented. The results show that the generalized scheme outperforms or is comparable to the other proprietary allocation schemes designed for the specific structure.
Fault-Tolerant Ring- and Toroidal Mesh-Connected Processor Arrays Able to Enhance Emulation of Hypercubes
Nobuo TSUDA

PAPER

Vol:
E84-D No:11
Page(s):
1452-1461
An advanced spare-connection scheme for K-out-of-N redundancy is proposed for constructing fault-tolerant ring- or toroidal mesh-connected processing-node arrays able to enhance emulation of binary hypercubes by using bypass networks. With this scheme, a component redundancy configuration for a base array with a fixed number of primary nodes, such as that for 8-node ring or 32-node toroidal mesh, can be constructed by using bypass links with a segmented bus structure to selectively connect the primary nodes to a spare node in parallel. These bypass links are allocated to the primary nodes by graph-node coloring with a minimum inter-node distance of three in order to use the bypass links as the hypercube connections as well as to attain strong fault tolerance for reconfiguring the base array with the primary network topology. An extended redundancy configuration for a large fault-tolerant array can be constructed by connecting the component configurations by using external switches of a hub type provided at the bus nodes of the bypass links. This configuration has a network topology of the parallel star-connections of sub-hypercubes whose diameter is smaller than that of the regular hypercube.
Design of Fault Tolerant Multistage Interconnection Networks with Dilated Links
Naotake KAMIURA Takashi KODERA Nobuyuki MATSUI

PAPER

Vol:
E84-D No:11
Page(s):
1500-1507
In this paper we propose a MIN (Multistage Interconnection Network) whose performance in the faulty case degrades as gracefully as possible. We focus on a two-dilated baseline network as a sort of MIN. The link connection pattern in our MIN is determined so that all the available paths established between an input terminal and an output terminal via an identical input of a SE (Switching Element) in some stage will never pass through an identical SE in the next stage. Extra links are useful in improving the performance of the MIN and do not complicate the routing scheme. There is no difference between our MIN and others constructed from a baseline network with regard to numbers of links and cross points in all SEs. The theoretical computation and simulation-based study show that our MIN is superior to others in performance, especially in robustness against concentrated SE faults in an identical stage.

41-60hit(96hit)

Keyword Search Result

[Keyword] interconnection network(96hit)

The Spanning Connectivity of the Burnt Pancake Graphs

Evaluation of Interconnect-Complexity-Aware Low-Power VLSI Design Using Multiple Supply and Threshold Voltages

Hamiltonian Cycles and Hamiltonian Paths in Faulty Burnt Pancake Graphs

Rearrangeability of Tandem Cascade of Banyan-Type Networks

Node-Disjoint Paths Algorithm in a Transposition Graph

Linear Layout of the Supercube

A Note on the Implementation of de Bruijn Networks by the Optical Transpose Interconnection System

Torus Ring: Improving Interconnection Network Performance by Modifying Hierarchical Ring

MMLRU Selection Function: A Simple and Efficient Output Selection Function in Adaptive Routing

Utilization of the On-Chip L2 Cache Area in CC-NUMA Multiprocessors for Applications with a Small Working Set

Speculative Selection Routing in 2D Torus Network

Hierarchical Interconnection Networks Based on (3, 3)-Graphs for Massively Parallel Processors

A Dynamically Adaptive Hardware on Dynamically Reconfigurable Processor

Minimum Feedback Node Sets in Trivalent Cayley Graphs

Node-to-Set Disjoint Paths Problem in Pancake Graphs

An Algorithm for Node-Disjoint Paths in Pancake Graphs

Escape and Restoration Routing: Suspensive Deadlock Recovery in Interconnection Networks

A Generalized Processor Allocation Scheme for Recursively Decomposable Interconnection Networks

Fault-Tolerant Ring- and Toroidal Mesh-Connected Processor Arrays Able to Enhance Emulation of Hypercubes

Design of Fault Tolerant Multistage Interconnection Networks with Dilated Links

Latest Issue

Links

Call for Papers

Submit to IEICE Trans.

Transactions NEWS

Popular articles